Fix: message-id in postprocessor/gelf-chunking #2662
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2662   +/-   ##
=======================================
  Coverage   91.22%   91.22%
=======================================
  Files         309      309
  Lines       60078    60083    +5
=======================================
+ Hits        54805    54812    +7
+ Misses       5273     5271    -2

... and 3 files with indirect coverage changes
👍 Looks reasonable; I see nothing that prevents using this. Great documentation too :)
Two things I noticed:
- It would be nice to mention how message IDs are generated in the docs (the //! section).
- If you are up for a challenge: we try to keep all time and randomness out of tremor to allow for deterministic replays. We do this by using ingest_ns as the random seed and as the time source, so that an event logged with its ingest_ns can be replayed and generate exactly the same output. The random function is a good example. It would be interesting to see the same concept reused here to allow repeatable yet still random message IDs. One way would be to use ingest_ns instead of the current epoch (which would be nice anyway, as looking up the time isn't fast) and then seed the RNG somehow (probably not with ingest_ns, as that would make it useless), perhaps with the first n bytes of the message, or with some bytes of a hash of the message?
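A minimal sketch of the second suggestion, under stated assumptions: the helper names (seed_from_message, replayable_id) are hypothetical and not part of tremor, SplitMix64 merely stands in for whatever RNG the postprocessor would use, and XOR is just one possible way to combine the two values. It only illustrates that a seed taken from the message plus ingest_ns yields the same ID on every replay.

```rust
/// One SplitMix64 step: advances the state and returns the next output.
/// Used here only as a small, self-contained deterministic PRNG.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical helper: build a seed from the first 8 bytes of the message,
/// zero-padded when the message is shorter than that.
fn seed_from_message(message: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    let n = message.len().min(8);
    buf[..n].copy_from_slice(&message[..n]);
    u64::from_be_bytes(buf)
}

/// Hypothetical helper: combine ingest_ns (instead of the wall clock) with
/// one output of the message-seeded PRNG.
fn replayable_id(ingest_ns: u64, message: &[u8]) -> u64 {
    let mut state = seed_from_message(message);
    ingest_ns ^ splitmix64(&mut state)
}

fn main() {
    let msg = b"{\"short_message\":\"hello\"}";
    // Replaying the same event with the same ingest_ns reproduces the ID.
    assert_eq!(replayable_id(123, msg), replayable_id(123, msg));
    // The same message ingested one nanosecond later gets a different ID.
    assert_ne!(replayable_id(123, msg), replayable_id(124, msg));
    println!("{:016x}", replayable_id(123, msg));
}
```

Note that in this sketch messages sharing their first 8 bytes also share a seed, which is why seeding from some bytes of a hash of the message, as suggested above, may be the better variant.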
(How would it break server-side on a collision? The message ID is how the server determines whether a UDP packet belongs to an already existing log or to a new one. If we send the same message ID for multiple logs, the server will merge their data together, which ends up breaking the log.) TBH I am also trying to figure this out. Please share any suggestions. Am I thinking about this correctly? 😅
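To make the collision concern concrete, here is a rough sketch of the GELF chunk layout (2 magic bytes 0x1e 0x0f, an 8-byte message ID, a 1-byte sequence number, a 1-byte sequence count, at most 128 chunks per message). The function chunk_message is hypothetical and is not the code in postprocessor/gelf_chunking.rs; writing the ID big-endian is an arbitrary choice for the illustration.

```rust
const GELF_CHUNK_MAGIC: [u8; 2] = [0x1e, 0x0f];
const GELF_MAX_CHUNKS: usize = 128;
const HEADER_LEN: usize = 12; // 2 magic + 8 message id + 1 seq number + 1 seq count

/// Hypothetical helper: split `payload` into GELF chunk datagrams.
/// Returns None for an empty payload, a zero chunk size, or more than 128 chunks.
fn chunk_message(message_id: u64, payload: &[u8], max_chunk_payload: usize) -> Option<Vec<Vec<u8>>> {
    if payload.is_empty() || max_chunk_payload == 0 {
        return None;
    }
    let count = payload.chunks(max_chunk_payload).count();
    if count > GELF_MAX_CHUNKS {
        return None;
    }
    let datagrams = payload
        .chunks(max_chunk_payload)
        .enumerate()
        .map(|(seq, part)| {
            let mut datagram = Vec::with_capacity(HEADER_LEN + part.len());
            datagram.extend_from_slice(&GELF_CHUNK_MAGIC);
            // Byte order of the ID is an assumption made for this sketch.
            datagram.extend_from_slice(&message_id.to_be_bytes());
            datagram.push(seq as u8); // safe: seq < count <= 128
            datagram.push(count as u8); // safe: count <= 128
            datagram.extend_from_slice(part);
            datagram
        })
        .collect();
    Some(datagrams)
}

fn main() {
    // Two different logs that happen to get the same message ID.
    let first = chunk_message(42, b"first log line", 8).unwrap();
    let second = chunk_message(42, b"other log line", 8).unwrap();
    // Their chunk headers are byte-for-byte identical ...
    assert_eq!(first[0][..HEADER_LEN], second[0][..HEADER_LEN]);
    // ... while the payloads differ.
    assert_ne!(first[0][HEADER_LEN..], second[0][HEADER_LEN..]);
}
```

Since the header carries nothing but the ID and the sequence fields, a receiver has no way to tell which of the two logs a given chunk belongs to.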
Yes, just the message content would not work. I'm still considering whether message content + ingest_ns (the nanosecond at which the message was registered in tremor) would be enough. If a server produced the same log twice in the same nanosecond, that would be very odd (but not impossible). OTOH, two randomly generated numbers being the same is also odd (but not impossible). It would also give us a more deterministic failure case: "When messages with the same content arrive at exactly the same time they will get duplicated message ids" instead of "if the RNG hates you, you'll get duplicated message ids".
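A minimal sketch of the content + ingest_ns idea, assuming a plain FNV-1a hash as a stand-in for whatever hash would actually be chosen; the function message_id is hypothetical and not taken from the PR. It shows the deterministic failure case described above: the ID repeats only when both the content and the nanosecond are identical.

```rust
/// FNV-1a, 64-bit. Picked for this sketch only because it is tiny and its
/// output does not depend on the process or the Rust version.
fn fnv1a64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: derive the message ID from ingest_ns followed by the
/// message content, with no clock lookup and no RNG involved.
fn message_id(ingest_ns: u64, content: &[u8]) -> u64 {
    let mut input = Vec::with_capacity(8 + content.len());
    input.extend_from_slice(&ingest_ns.to_be_bytes());
    input.extend_from_slice(content);
    fnv1a64(&input)
}

fn main() {
    let ns = 1_700_000_000_000_000_000u64;
    // Same content in the same nanosecond: the documented collision case.
    assert_eq!(message_id(ns, b"disk full"), message_id(ns, b"disk full"));
    // Same content one nanosecond later: a different ID.
    assert_ne!(message_id(ns, b"disk full"), message_id(ns + 1, b"disk full"));
    println!("{:016x}", message_id(ns, b"disk full"));
}
```

A 64-bit hash can still collide for unrelated inputs, so this narrows the failure case to something reproducible on replay rather than removing collisions altogether.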
Pull request
Description
Changed message-id from auto-increment ID to randomized ID in postprocessor/gelf_chunking.rs
HELP NEEDED: I have avoided adding the hostname when producing the message-id. I am not sure how we can handle adding the hostname; please suggest.
Related
Checklist
Performance